Abstract. Combining the properties of the monovariate internal functions
proposed in the Kolmogorov superimposition theorem with the bounds given by
the multivariate formulation of the Chebyshev inequality, a hybrid model is
presented that decomposes images into homogeneous, probabilistically bounded
multivariate surfaces. Given an image, the model works on a reduced image
representation while capturing the interaction among the multidimensional
information that describes the image content. Further, it tackles the
practical issues of preventing leakage, by bounding the growth of a surface,
and of reducing the problem sample size. The model also sheds light on how
the Chebyshev parameter relates to the number of pixels and to the
dimensionality of the feature space associated with a pixel. Initial
segmentation results on the Berkeley image segmentation benchmark indicate
the effectiveness of the proposed decomposition algorithm.

1 Introduction

Before an image can be decomposed, it must first be given a proper
representation. One set of solutions for image representation is the
decomposition of multivariate functions into monovariate functions, as
proposed by the Kolmogorov superimposition theorem (KST) [1]. Sprecher et al.
[2] have also proved that the monovariate internal functions obtained via the
KST can be used to build space filling curves that sweep a multidimensional
space. These 1D representations of the image can then be exploited for further
processing using simple univariate or bivariate signal processing methods, as
has been shown in [3] and [4]. It has further been proposed in [4] that either
the space filling curve can be fixed and an external function constructed
whose sums and compositions correspond to a multivariate function [2], or an
algorithm can be produced that generates an internal function which adapts to
the multivariate function and gives a different space filling curve for
different multivariate functions [5]. This manuscript is motivated by
presenting an initial hybrid model that uses a fixed space filling curve for
an image and differs in employing the multivariate Chebyshev inequality on the
$\mathcal{N}$-dimensional points lying on the curve, to yield homogeneous,
probabilistically bounded multivariate surfaces.

From the theory of space filling curves, it is known that the Hilbert Space
Filling Curve (HSFC) [6] is the best at preserving clustering properties while
taking into account the locality of objects in a multidimensional space ([7],
[8]). Even though it can be applied to transform a multidimensional image
representation into a linear format ([3], [4]), the manuscript applies the
HSFC to transform a 2D matrix into a 1D space filling curve. The reason is
that processing a 2D matrix saves time and avoids the complexity of processing
an $\mathcal{N}$D matrix.

Next, a multivariate formulation of the generalized Chebyshev inequality [9]
is applied to decompose the image into surfaces bounded probabilistically via
a single Chebyshev parameter. Since the bounded surfaces are constructed based
on the interaction of a set of points in $\mathbb{R}^{\mathcal{N}}$ lying on
the curve, it can be assumed that information about the nature or density of
the surface in a locality gets captured in these tiny patches. RGB images from
the Berkeley Segmentation Benchmark (BSB) [10] have been taken into
consideration for the current study. In the case of a single RGB image, three
dimensions exist in the colour map. These three form a feature set
($\mathcal{N} = 3$).

Several advantages arise with the use of this new hybrid model, namely:
• Faster processing of the image in 1D compared to analysis of neighbourhood
  information per pixel.
• Generation of homogeneous surfaces of different sizes that are bounded
  probabilistically by the inequality.
• The leakage problem is avoided due to the conservative nature of the
  inequality.
• Reduction in problem sample size by a factor $\epsilon$ (the Chebyshev
  parameter).
• The density of the image is investigated via the multivariate formulation of
  the inequality.
• The generalized hybrid model adapts to multivariate information while
  traversing a 1D fixed curve.

So far, a brief description of the state of the work has been given. In
section 2, the theoretical aspects of the method are dealt with in greater
detail using a toy image example. Section 4 deals with the empirical
evaluations conducted on the BSB. Lastly, the conclusion follows in section 6.

2 Theoretical Perspective 
2.1 Hilbert Space Filling Curve 

Space filling curves form an important subject as they help in transforming
a multidimensional dataset into a linear format. This comes at the price of
losing some amount of information, but the merits of preserving the local
properties while transforming objects from multiple dimensions to a single
dimension outweigh the incurred cost. The HSFC is a fractal space filling
curve proposed in [6] which fills the 2D plane in a continuous manner.
Analytical results found in [8], [7] and [16] prove the optimality of the
results obtained while using the HSFC. In the current formulation, a Matlab
implementation of [17] is used to generate the HSFC for 2D matrices.

Fig. 1. Hilbert Space Filling Curve for grid of size 2 (top left), 4 (top right), 8 (bottom
left) and 16 (bottom right).

Figure 1 shows the space filling curve for grids of size 2, 4, 8 and 16,
respectively. Note that the curve covers each and every point on the integer
grid exactly once while taking into account the local properties. The HSFC
also works for rectangular matrices, but the analysis of its cluster
preserving properties then becomes asymptotic in nature rather than exact, as
has been proved in [8]. Given an $\mathcal{N}$D image, the HSFC is generated
for its 2D grid and remains invariant to the image content.
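
As a rough illustration of the traversal, the sketch below maps a position d
along the curve to grid coordinates using the commonly used iterative
bit-manipulation construction. It is not the Matlab routine of [17]; the
function names hilbert_d2xy and hilbert_curve are hypothetical, and the grid
side is assumed to be a power of two.

```python
def hilbert_d2xy(n, d):
    """Map index d (0 <= d < n*n) along the Hilbert curve to (x, y).

    Assumption: n, the side of the square grid, is a power of two.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the quadrant so that the sub-curves join end to end.
        if ry == 0:
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_curve(n):
    """Return the grid points of an n x n grid in curve order."""
    return [hilbert_d2xy(n, d) for d in range(n * n)]


if __name__ == "__main__":
    print(hilbert_curve(2))
```

Consecutive entries of the returned list are always neighbouring grid cells,
which is the locality property the decomposition relies on.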



2.2 Multivariate Chebyshev Inequality 

Let $X$ be a stochastic variable in $\mathcal{N}$ dimensions with mean $E[X]$.
Further, let $\Sigma$ be the covariance matrix of all observations, each
containing $\mathcal{N}$ features, and $\epsilon \in \mathbb{R}$; then the
multivariate Tchebyshev inequality in [9] states that:

$$\mathcal{P}\{(X - E[X])^T \Sigma^{-1} (X - E[X]) \geq \epsilon\} \leq \frac{\mathcal{N}}{\epsilon}$$

$$\mathcal{P}\{(X - E[X])^T \Sigma^{-1} (X - E[X]) < \epsilon\} \geq 1 - \frac{\mathcal{N}}{\epsilon} \qquad (1)$$

i.e. the probability of the spread of the value of $X$ around the sample mean
$E[X]$ being greater than $\epsilon$ is less than $\mathcal{N}/\epsilon$.
There is a minor variation for the univariate case, stating that the
probability of the spread of the value of $x$ around the mean $\mu$ being
greater than $\epsilon\sigma$ is less than $1/\epsilon^2$. Apart from this
minor difference, both formulations convey the same message about the
probabilistic bound imposed when a random vector or number $X$ deviates from
the sample mean by a value of $\epsilon$.

Fig. 2. Starfish image from [10] and the 64 x 64 block under consideration.
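
The bound in equation 1 can be checked numerically. The sketch below is only
an illustration under stated assumptions: samples are drawn from a standard
normal distribution with identity covariance and zero mean, so that the
quadratic form reduces to a sum of squares. The function names are
hypothetical.

```python
import random


def chebyshev_upper_bound(n_features, eps):
    """Upper bound N/eps on P{(X-E[X])^T Sigma^{-1} (X-E[X]) >= eps}."""
    return n_features / eps


def empirical_tail(n_features, eps, n_samples=5000, seed=0):
    """Fraction of samples whose quadratic form is at least eps.

    Assumption: X is standard normal, so Sigma = I and E[X] = 0.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        d2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n_features))
        if d2 >= eps:
            hits += 1
    return hits / n_samples


if __name__ == "__main__":
    for eps in (4, 8, 16, 32):
        print(eps, empirical_tail(3, eps), chebyshev_upper_bound(3, eps))
```

The empirical tail is typically far below the bound, which reflects the
conservative nature of the inequality.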

In a broader perspective, the goal being to demarcate regions based on
surfaces, two questions need to be addressed regarding the decomposition of
an image:
• Which two pixels, or their corresponding $\mathcal{N}$D vectors, should be
  selected to initialize a surface depicting near uniform behaviour?
• What should be the size of such a restricted surface?

2.3 Initializing Surface 

The solution to the first question helps in initializing a surface. A pair of
vectors in $\mathcal{N}$D spans a flat plane with an angle subtended between
the two vectors. Given that the dot product exists in the higher dimensional
space, the cosine of the angle between the two vectors suggests the degree of
closeness between them. If $a$ and $b$ are two such vectors in $\mathcal{N}$D,
then the degree of closeness is given by:

$$\cos\theta = \frac{\langle a, b \rangle}{\|a\|_2 \times \|b\|_2} \qquad (2)$$

where $\langle a, b \rangle$ is the dot product and the denominator contains
the 2-norms of both vectors. It is well known that the absolute value of
$\cos\theta$ tends to 1 (0) as the vectors tend to be nearly parallel
(perpendicular). Let $u$, $v$ and $w$ be three consecutive pixels on the HSFC.
If the cosine of the angle between $u$ and $v$ evaluates to an absolute value
greater than a nearness threshold Npar (say 0.95), then the pair is considered
a valid surface. Npar is the nearness parameter used as a threshold to decide
whether two pixels are close enough to form a surface. If not, then $u$ is
left as a single point in $\mathcal{N}$D and the closeness criterion is
checked for $v$ and $w$ (and the process is repeated).
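
The nearness test of equation 2 can be sketched as follows. The function names
are hypothetical, and returning zero closeness for an all-zero vector is an
assumption made here to avoid a division by zero; the source does not say how
that case is handled.

```python
import math


def cosine_closeness(a, b):
    """Cosine of the angle between two feature vectors (equation 2)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: a zero vector is not close to anything
    return dot / (norm_a * norm_b)


def is_valid_pair(a, b, npar=0.95):
    """True if two consecutive pixels are close enough to start a surface."""
    return abs(cosine_closeness(a, b)) > npar


if __name__ == "__main__":
    print(is_valid_pair((200, 100, 50), (210, 105, 52)))  # similar colours
    print(is_valid_pair((255, 0, 0), (0, 255, 0)))        # orthogonal colours
```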



2.4 Size Of Surface 

Solving the second question defines the range of the surface. Once the start
and end points (say $v$ and $w$ respectively) of a valid surface have been
set, the size of the surface has to be determined. The surface comprises all
points that contribute towards uniform surface behaviour in $\mathcal{N}$D.
This degree of uniformity is controlled via the Chebyshev inequality. The idea
is executed as follows: the next consecutive point (say $t$) after $w$ is
considered for surface extent analysis. If the spread of the point $t$ from
$E(\text{surface})$, the mean of the existing surface $[v, w]$, factored by
the covariance matrix, is below $\epsilon$, then $t$ is considered part of the
surface and marked as the new end point of the surface. Using the Chebyshev
inequality, it boils down to:

$$\mathcal{P}\{(t - E([v, w]))^T \Sigma^{-1} (t - E([v, w])) \geq \epsilon\} \leq \frac{\mathcal{N}}{\epsilon}$$

$$\mathcal{P}\{(t - E([v, w]))^T \Sigma^{-1} (t - E([v, w])) < \epsilon\} \geq 1 - \frac{\mathcal{N}}{\epsilon} \qquad (3)$$

where $\Sigma$ is the covariance matrix between the $\mathcal{N}$D vectors
constituting the initial surface. Satisfaction of this criterion leads to
extension of the initial surface by one more point, i.e. $t$. The surface is
now constituted with $v$ and $t$ as start and end marker points. If not, the
size of the surface remains as it is and a fresh start is made with $t$ and
the next consecutive point on the HSFC. The satisfaction of the inequality
also gives a lower probabilistic bound on the size of the surface of
$1 - (\mathcal{N}/\epsilon)$, if the second version of the Chebyshev formula
is under consideration.
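
A minimal sketch of the growth test in equation 3 is given below. It assumes
NumPy is available, stores the surface with one column per pixel, and uses a
pseudo inverse of the covariance matrix. The function name and the cut-off
rcond=1e-10 are assumptions made for this illustration.

```python
import numpy as np


def extends_surface(surface, t, eps):
    """Return True if candidate pixel t satisfies the criterion of equation 3.

    surface: array of shape (n_features, n_points), one column per pixel.
    t:       array of shape (n_features,), the candidate pixel.
    """
    surface = np.asarray(surface, dtype=float)
    t = np.asarray(t, dtype=float)
    mean = surface.mean(axis=1)
    # np.cov treats rows as variables (features) and columns as observations.
    cov = np.atleast_2d(np.cov(surface))
    dev = t - mean
    # Pseudo inverse, since few pixels give a rank-deficient covariance.
    criterion = dev @ np.linalg.pinv(cov, rcond=1e-10) @ dev
    return bool(criterion < eps)


if __name__ == "__main__":
    initial = [[100, 102], [50, 52], [20, 22]]  # pixels v and w as columns
    print(extends_surface(initial, [103, 53, 23], eps=4))
    print(extends_surface(initial, [110, 60, 30], eps=4))
```

With only two pixels the covariance matrix has rank one, so the pseudo inverse
measures deviation only along the direction in which the initial pair varies.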

The above formulation implies that when a homogeneous patch is encountered,
a new point does not deviate much from the initial surface. Thus the size of
the surface grows smoothly. For a highly irregular patch, the surface size may
be very restricted due to the high variation of a pixel from the surface
against which it is being tested. Figure 2 shows the tiny patch (64 x 64) of
the starfish under consideration and figure 3 shows the HSFC generated over
the area of the image.

The tiny restricted surfaces generated using the multivariate and the
univariate formulations of the Tchebyshev inequality for $\epsilon = 3$ are
shown in figures 4 and 5. Note how the surfaces differ between the two
formulations. The former takes the entire $\mathcal{N}$D vectors in tandem to
compute the covariance (the texture interaction) and the mean, while the
latter computes the inequality separately for each dimension. The different
colours just indicate the different surfaces and have nothing to do with
clustering at this stage. The potential of the method is highlighted by the
bounded surfaces that have been obtained after image decomposition. This
boundedness is controlled via the parameter $\epsilon$, which determines the
degree of intricacy to which the texture interaction is taken into account.
It should be noted that in the univariate case the decision is made on the
majority vote after $\mathcal{N}$ evaluations of the inequality on the space
filling curve for a single pixel. The key difference is that the univariate
formulation does not capture the interaction, which later leads to lower
quality segmentation results compared to those given by the multivariate
formulation. These differences are apparent in figures 4 and 5. Figures 6 and
7 show the image patch decomposed into surface patches for different
$\epsilon$ values.

Fig. 4. Patch of 64 x 64 image of starfish from [10], decomposed using the multivariate
formulation of Chebyshev inequality with $\epsilon = 3$ and Npar = 0.95. A coloured line
shows the pixels associated to a single bounded surface on the HSFC.

Fig. 5. Patch of 64 x 64 decomposed using the univariate formulation of Tchebyshev
inequality with $\epsilon = 3$ and Npar = 0.95. A coloured line shows the pixels associated to
a single bounded surface on the HSFC. Note that size of surface is decided based on
voting across $\mathcal{N}$ evaluations.
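
The univariate variant with voting can be sketched as below. The text does not
spell out the exact voting rule, so two assumptions are made here: each
feature passes when the candidate lies within $\epsilon\sigma$ of the mean of
that feature, and the pixel is accepted on a strict majority of passes. The
function name is hypothetical.

```python
import statistics


def extends_surface_univariate(surface, t, eps):
    """Majority vote over per-feature univariate Chebyshev tests.

    surface: list of feature rows, each holding the values along the surface
             (at least two values per row).
    t:       candidate pixel, one value per feature.
    """
    votes = 0
    for values, x in zip(surface, t):
        mu = statistics.mean(values)
        sigma = statistics.stdev(values)  # sample standard deviation
        if abs(x - mu) < eps * sigma:
            votes += 1
    # Assumption: strict majority of the feature-wise tests must pass.
    return votes > len(t) / 2


if __name__ == "__main__":
    surf = [[100, 102], [50, 52], [20, 20]]
    print(extends_surface_univariate(surf, (103, 53, 25), eps=3))
    print(extends_surface_univariate(surf, (120, 70, 20), eps=3))
```

Because each feature is tested on its own, the covariance between features
never enters the decision, which is the interaction the multivariate
formulation captures.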



2.5 Implications 

The inequality being a criterion, the probability associated with it gives a
belief based bound on the satisfaction of the criterion. This gives rise to
certain simple implications as follows. Let $\mathcal{D}$ be a decomposition
which is equivalent to $(X_t - E[X])^T \Sigma^{-1} (X_t - E[X])$. Then:

Lemma 1. Decompositions $\mathcal{D}$ are bounded by a lower probability bound
of $1 - (\mathcal{N}/\epsilon)$, given that $\epsilon > \mathcal{N}$.

The bound holds not only for $\epsilon > \mathcal{N}$ but also for
$\epsilon < \mathcal{N}$. However, a probability being greater than a negative
value is always true, and thus $\epsilon = \mathcal{N}$ forms the lower bound.
As $\epsilon \to \infty$, $\mathcal{P}(\mathcal{D} < \epsilon) \to 1$.

Fig. 6. Decomposed surfaces using multivariate inequality with Npar = 0.95 and for
$\epsilon$ equal to 4 (top left), 8 (top right), 16 (bottom left) and 32 (bottom right).

Fig. 7. Decomposed surfaces using univariate inequality with Npar = 0.95 and for $\epsilon$
equal to 4 (top left), 8 (top right), 16 (bottom left) and 32 (bottom right).

Lemma 2. The value of $\epsilon$ reduces the size of the sample from
$\mathcal{M}$ to an upper bound of $\mathcal{M}/\epsilon$ probabilistically,
with a lower bound of $1 - (\mathcal{N}/\epsilon)$. Here $\mathcal{M}$ is the
number of pixels in a 2D matrix.

This holds true as the image is decomposed into surfaces which are
probabilistically bounded via the Chebyshev inequality. This decomposition
leads to a reduction in the sample size by a factor of $\epsilon$ while
retaining the information content, thanks to the space filling curve
traversal.

Lemma 3. As $\epsilon \to \mathcal{N}$ the lower probability bound drops to
zero, implying that a large number of small decompositions $\mathcal{D}$ can
be achieved. (Vice versa for $\epsilon \to \infty$.)

This gives an insight into the degree to which the image can be decomposed.
Where finer details are of importance, one may use values of $\epsilon$
tending to $\mathcal{N}$, and vice versa.
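
To make the lemmas concrete, the sketch below evaluates the lower probability
bound of Lemma 1 and the reduced sample size of Lemma 2 for the parameter
values used in the figures, assuming an RGB image ($\mathcal{N} = 3$) and the
64 x 64 patch. The function names are hypothetical.

```python
def lower_probability_bound(n_features, eps):
    """Lower bound 1 - N/eps of Lemma 1 (informative only when eps > N)."""
    return 1.0 - n_features / eps


def max_surfaces(n_pixels, eps):
    """Upper bound M/eps on the reduced sample size stated in Lemma 2."""
    return n_pixels / eps


if __name__ == "__main__":
    n_features, n_pixels = 3, 64 * 64  # RGB features, 64 x 64 patch
    for eps in (4, 8, 16, 32):
        print(eps,
              lower_probability_bound(n_features, eps),
              max_surfaces(n_pixels, eps))
```

For these values the bound rises from 0.25 at $\epsilon = 4$ to 0.90625 at
$\epsilon = 32$, while the sample size bound falls from 1024 to 128.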

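Theorem 1. Let image $X$ contain $\mathcal{M}$ pixels, with each pixel having
$\mathcal{N}$ features. If $X$ can be decomposed into
$\mathcal{L} = \mathcal{M}/\epsilon$ bounded surfaces via the proposed hybrid
model, then in case of the decompositions having equally likely probabilities:
(a) $\mathcal{M} = \epsilon^2/(\epsilon - \mathcal{N})$ and (b) $\epsilon$
lies in the open interval $(\mathcal{N}, \mathcal{M})$.

Proof. Since $X$ can be decomposed into $\mathcal{L}$ bounded surfaces, it is
known that the decompositions are disjoint sets. Let $\mathcal{D}_j$ (for
$j \in \{1, \ldots, \mathcal{L}\}$) be such decompositions. Then
$X = \bigcup_{j=1}^{\mathcal{L}} \mathcal{D}_j$. Considering $X$ as the
universal set, we get:

$$\mathcal{P}(X) = \mathcal{P}\Big(\bigcup_{j=1}^{\mathcal{L}} \mathcal{D}_j\Big) = \sum_{j=1}^{\mathcal{L}} \mathcal{P}(\mathcal{D}_j)$$

From lemma 1, it is known that
$\mathcal{P}(\mathcal{D}_j < \epsilon) \geq 1 - (\mathcal{N}/\epsilon)$. In
the case that the decompositions are equally likely, the lowest probability
for each $\mathcal{D}_j$ evaluates to $1 - (\mathcal{N}/\epsilon)$.

Thus,
$\sum_{j=1}^{\mathcal{L}} \mathcal{P}(\mathcal{D}_j) = \mathcal{L}(1 - \mathcal{N}/\epsilon) = (\mathcal{M}/\epsilon)(1 - \mathcal{N}/\epsilon)$.
Since $\mathcal{P}(X) = 1$ and
$\mathcal{P}(X) = \sum_{j=1}^{\mathcal{L}} \mathcal{P}(\mathcal{D}_j)$, it
implies that $(\mathcal{M}/\epsilon)(1 - \mathcal{N}/\epsilon) = 1$.
Multiplying both sides by $\epsilon^2$ gives
$\mathcal{M}(\epsilon - \mathcal{N}) = \epsilon^2$, which leads to
$\mathcal{M} = \epsilon^2/(\epsilon - \mathcal{N})$. This proves part (a) of
the theorem.

From part (a), it can be clearly seen that $\epsilon - \mathcal{N} \neq 0$,
otherwise part (a) would be invalid. Thus $\epsilon \neq \mathcal{N}$.
Similarly, if $\epsilon = \mathcal{M}$, then part (a) simplifies to
$\mathcal{N} = 0$. This cannot be the case as an image will have at least one
feature in terms of intensity. Thus $\epsilon \neq \mathcal{M}$. Let $k > 1$;
then for $\epsilon = \mathcal{N}/k$ part (a) shows that the ratio
$\mathcal{M}/\mathcal{N}$ is a negative quantity. This is not possible as both
numbers in the ratio are positive. Thus $\epsilon > \mathcal{N}$. Again, for
the same condition on $k$, if $\epsilon = k\mathcal{M}$, part (a) shows the
ratio $\mathcal{M}/\mathcal{N}$ to be a negative quantity. This again is a
contradiction given that $\mathcal{M}$ and $\mathcal{N}$ are positive. Thus
$\epsilon < \mathcal{M}$. Thus $\epsilon$ is strictly bounded in the open
interval $(\mathcal{N}, \mathcal{M})$. This finishes the proof for part (b). □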
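
Part (a) can be verified with exact rational arithmetic. The sketch below is
only a numerical check of the algebra in the proof; the function names are
hypothetical.

```python
from fractions import Fraction


def pixels_from_eps(eps, n_features):
    """Theorem 1(a): M = eps^2 / (eps - N), defined for eps != N."""
    eps, n_features = Fraction(eps), Fraction(n_features)
    return eps * eps / (eps - n_features)


def total_probability(n_pixels, eps, n_features):
    """(M/eps) * (1 - N/eps), which the proof sets equal to 1."""
    n_pixels = Fraction(n_pixels)
    eps, n_features = Fraction(eps), Fraction(n_features)
    return (n_pixels / eps) * (1 - n_features / eps)


if __name__ == "__main__":
    n_features = 3
    for eps in (4, 8, 16, 32):
        m = pixels_from_eps(eps, n_features)
        print(eps, m, total_probability(m, eps, n_features))
```

Choosing $\epsilon$ below $\mathcal{N}$ makes pixels_from_eps negative, which
is the contradiction used in part (b).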

Theorem 2. If an image $X$ is decomposed into bounded surfaces s.t. the latter
have equally likely probabilities, then theoretically the maximum value of
both $\epsilon$ (Chebyshev parameter) and $\mathcal{N}$ (feature
dimensionality of $X$) is of the order of $\sqrt{\mathcal{M}}$.

Proof. From theorem 1, it is known that
$\mathcal{M} = \epsilon^2/(\epsilon - \mathcal{N})$ when the probabilities of
the decomposed surfaces are equally likely. It is also known that
$\mathcal{N} < \epsilon < \mathcal{M}$. Let $k > 1$ and let $\epsilon$
decrease harmonically via $\epsilon = \mathcal{M}/k$ for $k$ from 2 onwards.
Then simplifying part (a) of theorem 1 gives
$\mathcal{M}/\mathcal{N} = k^2/(k - 1)$.

Now, if $\epsilon = \mathcal{M}/\mathcal{N}$, then
$\mathcal{M}/k = k^2/(k - 1)$. The fraction on the right hand side can be
split as $k + k/(k - 1)$. It is known that $k + 1 < k + k/(k - 1) < k + 2$
(for $k > 2$). Then $k + 1 < \mathcal{M}/k < k + 2$. Solving both inequalities
around $\mathcal{M}/k$, the value of $k$ lies between
$\sqrt{1 + \mathcal{M}} - 1$ and $(\sqrt{1 + 4\mathcal{M}} - 1)/2$. Taking the
order of the maximum value of $k$ as $\sqrt{\mathcal{M}}$,
$\epsilon = \mathcal{M}/k = \sqrt{\mathcal{M}}$ and
$\mathcal{N} = k = \sqrt{\mathcal{M}}$. □

Fig. 8. Image with 5 clusters using k-means (cityblock distance, 100 replicates and
1000 iterations) on decomposed surfaces generated via multivariate inequality with
Npar = 0.95 and for $\epsilon$ equal to 4 (top left), 8 (top right), 16 (bottom left) and 32
(bottom right).
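
The interval for $k$ can be checked numerically. The sketch below assumes the
relation $\mathcal{M}/k = k^2/(k - 1)$ from the proof and the bounds obtained
by solving $k + 1 < \mathcal{M}/k < k + 2$ for $k$; the function names are
hypothetical.

```python
import math


def k_bounds(n_pixels):
    """Interval for k obtained from k + 1 < M/k < k + 2."""
    lower = math.sqrt(1.0 + n_pixels) - 1.0
    upper = (math.sqrt(1.0 + 4.0 * n_pixels) - 1.0) / 2.0
    return lower, upper


def pixels_for_k(k):
    """M satisfying M/k = k^2 / (k - 1), i.e. M = k^3 / (k - 1)."""
    return k ** 3 / (k - 1)


if __name__ == "__main__":
    for k in (3, 10, 64):
        m = pixels_for_k(k)
        print(k, m, k_bounds(m), math.sqrt(m))
```

In each case $k$ falls inside the interval and differs from
$\sqrt{\mathcal{M}}$ by less than one, consistent with the stated order.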



Theorem 1 shows how the sample size $\mathcal{M}$ is related to the
dimensionality of the feature space $\mathcal{N}$ via the Chebyshev parameter
$\epsilon$. Implicitly, it also states that the sample size $\mathcal{M}$ must
be greater than $\mathcal{N}$ and that the value of $\epsilon$ lies in the
open interval $(\mathcal{N}, \mathcal{M})$. These tight bounds in an
idealistic case show that the model is effective in a theoretical sense. The
second theorem builds on the first and shows that the maximum theoretical
value of the parameter $\epsilon$ is of the order of the square root of the
sample size $\mathcal{M}$, and so is the dimensionality of the feature space
$\mathcal{N}$. In an ideal case of equal likelihood, this maximum value gives
an upper bound on $\epsilon$ as well as $\mathcal{N}$, such that the
decompositions are uniformly spread in the higher dimensional space.

Fig. 9. Images from [10] segmented for Npar = 0.95 and for $\epsilon$ equal to 4, 8, 16 and 32,
shown from the second image on the left to the right, with number of clusters 5.



2.6 Clustering On Bounded surfaces 



Once the image has been decomposed into bounded surfaces, segmentation of the
image is done on the average values of the surfaces via the k-means algorithm
in Matlab with a predefined number of clusters. The cityblock distance is used
as the metric for k-means, with of the order of 100 replicates and 1000
iterations. The reason for using a high number of replicates and iterations is
to avoid getting stuck in local solutions. The average values are computed by
taking the mean of the $\mathcal{N}$D points that constitute the bounded
surfaces. These values are considered to be robust as the surfaces themselves
are probabilistically bounded in the first place, taking into account the
variability in intensity behaviour.
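
The clustering step can be sketched in a few lines. This is not the Matlab
routine used in the manuscript: it is a single run of a k-means style loop
with the cityblock (L1) distance, and it assumes the component-wise median as
the centroid update, the usual choice for that distance. The function names
and the explicit initial centroids are hypothetical; replicates would repeat
the run from different starts.

```python
import statistics


def cityblock(a, b):
    """Sum of absolute differences between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def kmeans_cityblock(features, init_centroids, max_iter=1000):
    """Cluster per-surface mean vectors; returns (labels, centroids)."""
    centroids = [list(c) for c in init_centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda c: cityblock(f, centroids[c]))
            for f in features
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for c in range(len(centroids)):
            members = [f for f, lab in zip(features, labels) if lab == c]
            if members:
                # The component-wise median minimizes the summed L1 distance.
                centroids[c] = [statistics.median(col) for col in zip(*members)]
    return labels, centroids


if __name__ == "__main__":
    feats = [[10, 10, 10], [12, 11, 9], [200, 190, 210], [205, 195, 205]]
    print(kmeans_cityblock(feats, [feats[0], feats[2]]))
```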

Figure 8 shows the corresponding result for different values of $\epsilon$.
The figure shows that the quality of segmentation degrades as the value of
$\epsilon$ increases, since this increases the surface size. For example, the
grooves on the starfish are captured in greater detail for $\epsilon = 4$ than
for higher values. A few sample images from [10], for which segmented images
were generated over different values of $\epsilon \in \{4, 8, 16, 32\}$, are
presented in figures 9 and 10. The number of clusters was predefined to be 5.
With increasing $\epsilon$, the number of decompositions reduces, which in
turn affects the quality of the segmentation. This is apparent in the figures
as one moves from left to right. The clustering on these bounded surfaces is
not only good but also less time consuming, as the segmentation is done on a
reduced sample size while still retaining the crucial pieces of information.
The solution is not claimed to be the best, but the results appear to be of
good quality at first sight.

Fig. 10. Images from [10] segmented for Npar = 0.95 and for $\epsilon$ equal to 4, 8, 16 and
32, shown from the second image on the left to the right, with number of clusters 5.

3 Decomposition Algorithm 

Algorithm 1 and its continuation in Algorithm 2 show the implementation for
decomposing the image into probabilistically bounded surfaces based on space
filling curve traversal. The depicted version is for the multivariate
formulation of the Tchebyshev inequality. A minor change, in the form of the
univariate formulation used separately with each of the $\mathcal{N}$
dimensions, would lead to the univariate version.

Note that the output of the decomposition algorithm is a list of bounded
surfaces. Many features could be developed, but in this manuscript the mean
value of all the $\mathcal{N}$D points constituting a surface is taken as the
feature vector. This is because the inequality measures the degree of
homogeneity of density interaction in $\mathcal{N}$D.

Another point to be aware of is the computation of the measure in the
multivariate Chebyshev inequality in equation 3. Since it requires the inverse
of the covariance matrix, one may run into problems of sparseness, or of a
mismatch between the number of samples and the number of features. To overcome
these problems, the pseudo inverse was computed using the singular value
decomposition implementation in Matlab. Being fast and effective, the
decomposition of the image into bounded surfaces works within a matter of
seconds.
Algorithm 1 Tchebyshev Surfaces

procedure TchSurf(img, ε, Npar, formulation)
    [nrows, ncols] ← size(img)
    [R, S] ← hlbrtcrv(nrows, ncols)                  ▷ HSFC coords
    ▷ Initialize variables
    markedVertex ← []                                ▷ Marked pixels per surface
    surface_list ← {}                                ▷ List of surfaces
    feature_list ← {}                                ▷ List of features per surface
    surface_no ← 0                                   ▷ Count of surfaces
    ▷ Generate bounded surfaces on HSFC
    s_idx ← 1                                        ▷ Index of pt where the curve starts
    while s_idx < len - 1 do                         ▷ len is length of curve
        e_idx ← s_idx + 1
        [rs, cs] ← [R(s_idx), S(s_idx)]
        [re, ce] ← [R(e_idx), S(e_idx)]
        pix_info ← struct()                          ▷ Initially two pixels
        pix_info.loc ← [rs, cs; re, ce]              ▷ Store locations
        ▷ Store intensity values per pixel
        surface ← []
        [lenidx, cols] ← size(pix_info.loc)
        for i = 1 : lenidx do
            [r, c] ← [pix_info.loc(i, 1), pix_info.loc(i, 2)]
            temp ← []
            for j = 1 : N do                         ▷ Features
                temp ← [temp; img(c, r, j)]
            end for
            surface ← [surface, temp]
        end for
        ▷ Compute nearness between two initial pixels
        innerprod ← dot(surface(:, 1), surface(:, 2))
        cosval ← innerprod / (norm(surface(:, 1), 2) × norm(surface(:, 2), 2))
        ▷ Check cos θ greater than nearness param
        if cosval > Npar then
            e_idx ← e_idx + 1
            while e_idx < len do
                ▷ Store next pixel on HSFC
                [re, ce] ← [R(e_idx), S(e_idx)]
                pix_loc ← [re, ce]
                ▷ Store intensity values for pixel
                new_pix ← []
                for j = 1 : N do
                    new_pix ← [new_pix; img(ce, re, j)]
                end for

Algorithm 2 Tchebyshev Surfaces Continued

                ▷ Tchebyshev's inequality for fixing length of surface
                ▷ NOTE - rows are dimensions and cols are pixels in surface matrix
                surface_mu ← μ(surface^T)
                surface_std ← std(surface^T)
                cov_R ← cov(surface^T)
                dev ← (new_pix - surface_mu^T)
                ▷ Compute inequality in equ 3
                criterion ← dev^T * pinv(cov_R) * dev
                decision ← 0
                if criterion < ε then
                    decision ← decision + 1
                end if
                if decision ≥ 1 then
                    surface ← [surface, new_pix]
                    pix_info.loc ← [pix_info.loc; pix_loc]
                    e_idx ← e_idx + 1
                else
                    break
                end if
            end while
            ▷ Store the surfaces, associated pixels and features
            surface_no ← surface_no + 1
            surface_list{surface_no} ← pix_info.loc
            ▷ Mean per dimension as feature for a surface
            featval ← []
            for i = 1 : N do
                featval ← [featval; μ(surface(i, :))]
            end for
            feature_list{surface_no} ← featval
            s_idx ← e_idx
        else                                         ▷ If only one pixel is a surface
            pix_info.loc(2, :) ← []
            surface(:, 2) ← []
            surface_no ← surface_no + 1
            surface_list{surface_no} ← pix_info.loc
            ▷ Mean per dimension as a feature for just one pixel
            featval ← []
            for i = 1 : N do
                featval ← [featval; μ(surface(i, :))]
            end for
            feature_list{surface_no} ← featval
            s_idx ← s_idx + 1
        end if
    end while
end procedure
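
The overall loop can be sketched compactly in Python. This is an illustrative
reading of the algorithm under several assumptions, not the authors' Matlab
code: NumPy is assumed to be available, the pixels are assumed to be already
ordered along the HSFC, surfaces are returned as half-open index ranges, and
the pseudo inverse cut-off rcond=1e-10 is chosen for this sketch. The names
decompose and _cosine are hypothetical.

```python
import numpy as np


def _cosine(a, b):
    """Cosine of the angle between two feature vectors."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def decompose(points, eps, npar=0.95):
    """Split pixels (in HSFC order) into bounded surfaces.

    points: array of shape (n_points, n_features).
    Returns a list of (start, end) index pairs, end exclusive.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    surfaces = []
    s = 0
    while s < n:
        e = s + 1
        # Initialize a surface only if the first two pixels are close.
        if e < n and _cosine(points[s], points[e]) > npar:
            e += 1
            while e < n:
                block = points[s:e].T  # rows are features, columns are pixels
                dev = points[e] - block.mean(axis=1)
                cov = np.atleast_2d(np.cov(block))
                if dev @ np.linalg.pinv(cov, rcond=1e-10) @ dev < eps:
                    e += 1  # candidate joins the surface
                else:
                    break
        surfaces.append((s, e))
        s = e  # fresh start from the first pixel not in the surface
    return surfaces


if __name__ == "__main__":
    pts = [(100, 50, 20), (102, 52, 22), (103, 53, 23),
           (10, 200, 10), (0, 0, 255), (0, 250, 5)]
    print(decompose(pts, eps=4))
```

The feature of each surface would then be the mean of points[start:end] along
each dimension.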




Fig. 11. Row wise top to bottom: Original image, probabilistic boundaries for human
segmentation, mTch (Npar = 0.95, $\epsilon = 4$ and number of clusters 10), EM (number
of clusters 10), nCuts (number of clusters 10), grBase ($\sigma = 0.8$, $k = 300$) and mShift
($h_r = 8$, $h_s = 8$).

Fig. 12. Row wise top to bottom: Original image, probabilistic boundaries for human
segmentation, mTch (Npar = 0.95, $\epsilon = 4$ and number of clusters 10), EM (number
of clusters 10), nCuts (number of clusters 10), grBase ($\sigma = 0.8$, $k = 300$) and mShift
($h_r = 8$, $h_s = 8$).

Rank   F-score   Algorithm
 -     0.79      Humans
 1     0.63      Expectation Maximization (EM)
 2     0.59      Multivariate Chebyshev (mTch)
 3     0.58      Mean Shift (mShift)
 4     0.50      Normalized Cuts (nCuts)
 5     0.28      Graph Based (grBase)

Table 1. Summary table of rank of algorithms based on F-score generated on 100 test
images in the BSB dataset [10].



4 Experiments 

To test the effectiveness of the multivariate Chebyshev algorithm (mTch), the
results obtained on image segmentation were compared with some of the standard
existing algorithms. The other algorithms employed were normalized cuts
(nCuts) [11], [12], edge graph based segmentation (grBase) [13], mean shift
clustering (mShift) [14] and an expectation maximization (EM) based
implementation [15]. Figure 13 shows the standard peppers image that was
segmented into 10 clusters for mTch, EM and nCuts. For grBase ($\sigma = 0.8$,
$k = 300$) and mShift ($h_r = 8$, $h_s = 8$), parameter values were specified
as mentioned in the literature. Matlab implementations of grBase and mShift
were taken from [18] to produce the results. All algorithms were tested with
fixed parameter values on images with varying content. This implies that the
results per image may not be optimized, which may seem somewhat unfair, but
from another perspective such an experiment also suggests how robust an
algorithm is against the variance in content of images, given a fixed
parameter value. This outlook is supported by the fact that both the grBase
and mShift algorithms in this experimental setup do not fare well with fixed
parameter values for all images, as compared to the proposed mTch algorithm.
The F-scores generated later bolster this claim.

Probabilistic boundaries were generated based on brightness and texture
gradients [10] on the segmented images generated from the algorithms under
study as well as on the available human segmentations. The boundaries were
later evaluated to find F-scores. In many of these images, the segmentations
based on bounded surfaces gave good and consistent results. Table 1 shows the
summary of F-scores obtained by the algorithms on the benchmark. It shows that
the segmentations from the bounded surfaces gave better results than some of
the standard well known algorithms. The mTch ranks second, after EM. A reason
for the lower performance of mTch with respect to EM may be that it ignores
important features like surface gradient and reflectance properties, which may
add more discriminative information than the average intensity values used at
the current stage. Although EM often gives good performance, there are places
where it does not capture the nature of the surface well. One such case is the
texture of the purple cloth and the wrinkles in it, in the peppers image
(figure 13). For more such cases, the generated results can be made available.

Fig. 13. (a) Original and segmented images from (b) mTch (Npar = 0.95, $\epsilon = 4$ and
number of clusters 10), (c) EM (number of clusters 10) (d) nCuts (number of clusters
10) (e) grBase ($\sigma = 0.8$, $k = 300$) and (f) mShift ($h_r = 8$, $h_s = 8$).



It is widely known that mShift yields good segments such that no local
information is left behind. Too much information may sometimes not be
necessary while segmentations are being compared. Figures 11 and 12 show the
boundaries generated using the routines in the benchmark. Note that in these
images, the mTch gave the best results. Also, at first sight it may appear
that the boundaries have not been generated well in the case of mShift and
grBase, but this is not the case, as careful investigation does show their
presence.

The grBase [13] works on the basis of a predicate that measures the evidence
of a boundary using two ideas, namely the comparison of intensity differences
across the boundary and the intensity differences among neighbouring pixels
within a region. For this, a threshold function is proposed which depends on
the components and their respective sizes. This also means that the parameter
used for scaling components would be different for different images, if good
segmentation results are desired in grBase. In comparison, the mTch decomposes
the images into bounded surfaces or components that preserve the homogeneity
in texture, using the image content invariant HSFC and the multivariate
measure that captures the local neighbourhood interaction in the Tchebyshev
inequality. Thus constant parameter values for mTch would give good results
for different images most of the time. Results on the benchmark dataset
support this: for the same parameter values, the segmentation results of
grBase are inferior to those of mTch on a sample of 100 test images.

It must be noted that the parameter $k$ used in [13] scales the size of the
component and is not the minimum component size. Thus it is taken as a
constant and no relation is derived between it and the interaction present in
the image. On similar lines, $\epsilon$ in mTch is also a parameter that
defines the degree of control over the size of components, but with a bound.
The parameter $\epsilon$ characterizes the size of a surface or component
probabilistically, while relating to the control of the degree of texture
interaction using the multivariate measure in the Tchebyshev inequality
(equation 3). This probabilistic bound is the key relation between the size of
a component and the texture interaction within it, and it decomposes the image
into homogeneous surfaces in $\mathcal{N}$D by lemma 3. Clustering on a set of
homogeneous surfaces is expected to give robust segmentation results. The
price that is paid is the time required to cluster the surfaces using the
standard k-means.

Figures 11 and 12 show the boundaries generated using the routines in the
benchmark, which are later evaluated to generate F-scores that determine the
accuracy of the proposed algorithm as well as that of the other algorithms.
Note that in these images mTch gave the best results. At first sight it may
appear that the boundaries have not been generated well for mShift and
grBase, but careful inspection does show their presence. Figures 14(a) and
14(b) show the precision-recall curves for (EM, mTch) and (mTch, mShift),
respectively. Figures 15(a) and 15(b) show the curves for (mTch, nCuts) and
(mTch, grBase), respectively.
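
The Berkeley benchmark summarizes a precision-recall curve by the F-measure,
the harmonic mean of precision and recall. The sketch below shows that
computation in isolation. It is not the benchmark's own code, and the
function names and the example precision-recall pairs are hypothetical.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

def best_operating_point(curve):
    """Return the (precision, recall, F) triple with the highest F on a curve."""
    scored = ((p, r, f_score(p, r)) for p, r in curve)
    return max(scored, key=lambda item: item[2])

# Hypothetical precision-recall pairs, not values taken from the paper.
example_curve = [(0.9, 0.3), (0.7, 0.6), (0.5, 0.8)]
best = best_operating_point(example_curve)
```

A curve that lies further towards the top right corner of the plot yields a
higher best F-score, which is how two algorithms can be ranked from curves
such as those in Figures 14 and 15.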

5 Discussion 

The proposed method has advantages as well as disadvantages. This section gives 
an analysis of the intricate points of the algorithm as well as hints as to where 
improvements can be made. 

Neighbourhood Information is implicitly considered using an HSFC traversal.
The topography of the surface is captured by treating all M dimensions of
the image together. For an RGB image, M = 3.
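
To make the traversal concrete, the sketch below maps positions along a
Hilbert curve to pixel coordinates and uses them to flatten an image into a
1D sequence of M-dimensional samples. It uses the widely known iterative
index-to-coordinate construction and assumes a square image whose side is a
power of two. How the paper handles other image sizes is not covered here,
and the function names are hypothetical.

```python
import numpy as np

def hilbert_d2xy(side, d):
    """Map index d along a Hilbert curve to (x, y) on a side x side grid.

    `side` is assumed to be a power of two.
    """
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate or flip the quadrant so that sub-curves join up.
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def hilbert_flatten(image):
    """Return the pixels of a square image in Hilbert-curve order."""
    side = image.shape[0]
    if image.shape[1] != side or (side & (side - 1)) != 0:
        raise ValueError("expected a square image with a power-of-two side")
    coords = [hilbert_d2xy(side, d) for d in range(side * side)]
    xs, ys = zip(*coords)
    # Row index is y, column index is x; any channel axis is kept as is.
    return image[np.array(ys), np.array(xs)]

# Hypothetical 8 x 8 RGB image (M = 3); the result has shape (64, 3).
demo_image = np.arange(8 * 8 * 3).reshape(8, 8, 3)
demo_sequence = hilbert_flatten(demo_image)
```

Consecutive samples in the flattened sequence always come from adjacent
pixels, which is the sense in which the traversal carries neighbourhood
information implicitly.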

Filtering of the image is not required unless absolutely necessary. The
selection of surfaces, and the check of their validity based on the
variation of points in A/T) via the Tchebyshev inequality, makes the
filtering operation computationally redundant. In particular, the variation,
or standard deviation, of the A/T) points is the measure that determines how
much noise can be tolerated. If the variation is too high, the initialized
surface may be invalid for further processing due to excessive noise.

Fig. 14. Precision-recall curves for (a) EM (number of clusters 10) and mTch
(Npar = 0.95, e = 4 and number of clusters 10) and (b) mTch (Npar = 0.95,
e = 4 and number of clusters 10) and mShift (h_r = 8, h_s = 8).



Fig. 15. Precision-recall curves for (a) mTch (Npar = 0.95, e = 4 and number
of clusters 10) and nCuts (number of clusters 10) and (b) mTch (Npar = 0.95,
e = 4 and number of clusters 10) and grBase (a = 0.8, k = 300).

The Leakage problem is that it is not known when to stop growing a surface
in order to determine the size of the area. This is tackled by determining
the length of the surface via the Tchebyshev inequality, instead of using
thresholds based on image-intrinsic intensity values. One aspect that
affects the performance of the algorithm is the k-means clustering, which
may require several iterations as well as replicates in order to converge
and produce clusters without getting stuck in some local minimum. In general
terms, if one sets aside the intricacies of k-means, the whole framework
works well on surfaces and gives good segmentation results on the benchmark.
The k-means on surfaces is faster than the k-means on pixels, as the sample
size of the former is reduced by a factor of e while the local properties
are retained using the Hilbert space filling curve.
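
The role of replicates mentioned above can be sketched as follows: k-means
is run several times from different random initializations and the run with
the lowest within-cluster sum of squares is kept. This is a minimal NumPy
sketch of plain Lloyd iterations, not the implementation used in the
experiments; the function names, default values and demo data are
assumptions.

```python
import numpy as np

def kmeans_once(data, k, rng, max_iter=100):
    """One run of Lloyd's algorithm from a random choice of k data points."""
    data = np.asarray(data, dtype=float)
    centers = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(max_iter):
        dist = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:  # keep the old centre if a cluster empties
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    dist = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = dist.argmin(axis=1)
    inertia = float(dist[np.arange(len(data)), labels].sum())
    return labels, centers, inertia

def kmeans_replicates(data, k, replicates=5, seed=0):
    """Keep the replicate with the lowest within-cluster sum of squares."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(replicates):
        result = kmeans_once(data, k, rng)
        if best is None or result[2] < best[2]:
            best = result
    return best

# Hypothetical surface descriptors: two well separated groups in M = 3.
demo_rng = np.random.default_rng(1)
demo_data = np.vstack([demo_rng.normal(0.0, 0.1, size=(50, 3)),
                       demo_rng.normal(10.0, 0.1, size=(50, 3))])
demo_labels, demo_centers, demo_inertia = kmeans_replicates(demo_data, k=2)
```

Each iteration costs time proportional to the number of samples, so running
the same procedure on surfaces rather than on pixels reduces the work per
iteration in line with the reduction in sample size described above.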
The current version of the algorithm does not optimize the value of e in
equation 3 or of Npar, the initialization parameter. Silhouette validation [?]
could be one of the methods employed to find the best number of clusters k.
The current research does not focus on the number of clusters per se. Rather
than being treated as quantities to optimise, Npar and e act as parameters
that give a degree of control over the inclusion of points for surface
initialization and over the size of the surface, respectively. The
experiments indicate that the initial model, without much tuning, gives
comparable results for segmentation. Robust performance across images with
varying content points towards the benefits of using a hybrid model that
currently uses a fixed Chebyshev parameter value, a simple measure of
similarity and a fixed space filling curve. Intuitively, it can be inferred
that clustering on these probabilistically bounded homogeneous surfaces will
be faster than clustering on pixels. This is because the homogeneous
intensity values are bundled together, which reduces the sample size of the
original problem by a factor of e (the Chebyshev parameter). Because of its
generalized framework, the proposed decomposition algorithm can find
application in areas such as the generation of textures, combining
information from multimodal sources as in biomedical images, and processing
multidimensional information on space filling curves, to name a few.
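
As a hedged sketch of how silhouette validation could score a clustering,
the function below computes the mean silhouette coefficient from pairwise
Euclidean distances. Evaluating it for each candidate number of clusters and
keeping the highest score is one possible selection rule; it is not part of
the current algorithm, and the function name and toy data are hypothetical.

```python
import numpy as np

def silhouette_score(data, labels):
    """Mean silhouette coefficient; needs at least two distinct labels."""
    data = np.asarray(data, dtype=float)
    labels = np.asarray(labels)
    diff = data[:, None, :] - data[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    clusters = np.unique(labels)
    scores = np.zeros(len(data))
    for i in range(len(data)):
        own = labels == labels[i]
        n_own = int(own.sum())
        if n_own == 1:
            continue  # a singleton cluster is given a silhouette of 0
        # a: mean distance to the other members of the same cluster.
        a = dist[i, own].sum() / (n_own - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        denom = max(a, b)
        scores[i] = 0.0 if denom == 0 else (b - a) / denom
    return float(scores.mean())

# Toy data: two tight pairs of points, labelled sensibly and badly.
toy = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
good = silhouette_score(toy, [0, 0, 1, 1])
bad = silhouette_score(toy, [0, 1, 0, 1])
```

Scores close to 1 indicate compact, well separated clusters, while negative
scores indicate that points sit closer to another cluster than to their own.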

6 Conclusion 

A novel hybrid model for image decomposition has been proposed. The model
works on a reduced image representation based on monovariate functions and
processes information spread across a multidimensional framework. Initial
segmentation results indicate the efficacy of the tool in terms of
generalisation and robustness across images with varying content.